🌻 Hierarchical coding#

22 Sep 2025

Simplifying large causal maps with hierarchical labels (zooming)#

image.png

Summary#

When you code lots of sources, you quickly end up with too many near-duplicate causes and effects (and therefore too many nodes on the map). Hierarchical factor labels let you keep the detail and produce a smaller “summary map” by “zooming out” to a higher level.

In the Causal Map app we write hierarchies using the separator ;, but that’s just one convenient encoding. The important thing is the meaning: you’re saying “this is a specific case of that, and I’m happy to report it at the higher level if needed”.

Example (using ; to encode “general; specific”):

New intervention; midwife training ➜ Healthy behaviour; hand washing

Read it as:

You can extend this to multiple levels:

New intervention; midwife training; hand washing instructions

Crucially, higher-level labels can also be used directly for coding. So you can code both detailed links and higher-level links, depending on what the text supports.

What problem this solves (for practitioners)#

If you are coding a handful of interviews, you can “just remember” what all your labels mean. But as soon as you code dozens or hundreds of sources, a common workflow problem appears:

Hierarchical labels are a practical compromise: keep the detailed evidence, but still be able to summarise cleanly.

The idea in one minute#

Write detailed factors as:

General concept ; specific concept

If you code:

New intervention; midwife training ➜ Healthy behaviour; hand washing

…you can later zoom out and treat this as also supporting:

New intervention ➜ Healthy behaviour

This gives you a smaller map that is easier to read and present, without deleting the detailed evidence underneath.

How to write hierarchical labels (coding convention)#

In the app, the convention is:

General concept; specific concept

Two practical conveniences follow immediately:

A practical heuristic#

When you choose a parent label, pick something you would genuinely be comfortable reporting at a higher level. If “rolling up” the specific item into the parent would mislead a reader, don’t put it in that hierarchy.

AI-assisted coding tip (optional)#

If you are using AI to extract candidate links, a simple prompt constraint often helps: ask for factor labels using the template “general; specific”, and require a verbatim quote for every link so the evidence remains checkable.

Zooming out (what it does)#

image.png

Assuming you have a causal map that uses hierarchical labels (like the small map shown above), you can “zoom out” by rewriting factor labels to a chosen maximum depth and then redrawing the map.

In practice:

Examples (contrast) from the app#

The easiest way to see what zooming is doing is to compare the same “top factors” view with and without zooming:

Top factors (no zoom) (bookmark 983)

Top factors (with zoom) (bookmark 984)

Don’t treat this mechanically: beware the Transitivity Trap#

Warning: causal mapping here is a qualitative process. Zooming is a powerful summarisation tool, but it only makes sense when your hierarchy choices really do preserve the intended meaning.

Looking at a causal map on its own never lets you safely read off coherent narrative stories along long pathways (the “transitivity trap”). If you want pathway interpretations that reflect single-source stories, use source tracing first, then apply zooming/collapsing for presentation.

See:

Zooming out is like a licensed “deduction family”#

If you have:

New intervention; midwife training ➜ Healthy behaviour; hand washing

…then (loosely yet informatively, with caveats) you are licensing the summary interpretations:

This reflects the familiar granularity dilemma: how much detail should you code on the cause side and/or effect side? Hierarchies let you keep detail while explicitly stating what higher-level roll-ups are acceptable.

Why this matters in practice#

Hierarchical labels + zooming let you do both of these:

Typical payoffs:

Two common scenarios:

Guardrails (useful rules of thumb)#

Hierarchies work well when the hierarchy is meaningful. These guardrails help you avoid misleading roll-ups:

Label factors as events or changes if you can#

Hierarchies work best when the parent labels are themselves meaningful causal factors (often semi-quantitative), not just headings or categories. Good examples:

Here, Social problems and Psychosocial stressors are higher-level causal factors in their own right; they are not just themes or boxes to put factors into.

So we might have:

“The problem of unemployment is a psychosocial stress for many”

Experiencing social problems; Unemployment ➜ Experiencing psychosocial stressors

“When people get stressed they often turn to drugs“

Experiencing psychosocial stressors ➜ Experiencing social problems; Addiction

These could be combined into this story:

Experiencing social problems; Unemployment ➜ Experiencing psychosocial stressors ➜ Experiencing social problems; Addiction

If we zoom out of the above story, we could focus on the higher-level factors and in this case we would get a vicious cycle:

Experiencing social problems ➜ Experiencing psychosocial stressors ➜ Experiencing social problems

When not to use a hierarchy (use a tag instead)#

Don’t use ; to express a non-causal topic theme. A; B explicitly licenses: “for summary purposes, treat this as A”, so A must be something you would treat as a causal factor, not just a theme label like “Health”.

For theme-style grouping, use tags (hashtags, brackets, or any consistent convention) inside the label, e.g.:

This keeps the causal claim minimalist (X ➜ Y) while still letting you filter/search by theme.

Hierarchical coding as a way of coping with lots of factors#

Analysts are often faced with the quandary of either having too many factors to keep track of, or merging them into a smaller number of factors and losing information. With hierarchies, you can have your cake and eat it: it’s similar to recoding an unwieldy number of factors into a smaller number of less granular items, but with the advantage that the process is reversible. The information can be viewed from the new higher level and from the original, more granular level.

For example, you can count that the higher-level component “Healthier behaviour” was mentioned ten times, while retaining the information about the individual mentions of its more granular components.

Re-usable factor components as hashtags#

Sometimes your factors relate to each other in ways which are not just hierarchical. For example:

These are three (hierarchical) factors in which “Health” appears in different places, and at different levels of the hierarchy.

This is not ideal, but sometimes it’s the best way to organise a tricky set of factors.

In this example, “Health” appears only as a component of other factors. Although on its own it might look like a mere theme rather than a causal factor, it plays a role in differentiating the causal factors in which it participates (e.g. “Activities completed, in particular training, in particular on health”). And because “Health” is used across hierarchies, it can also be treated as a hashtag and can be used as part of searches, lists and counts of factors.

Isn’t that a contradiction? Didn’t we just say that “Health” is not to be used on its own as a factor because it is just a theme and is not expressed in a semi-quantitative way? No: here the word “Health” is not functioning only as a theme but as a way of differentiating causal factors. The actual factor labels in which it participates are still expressed in a semi-quantitative way. So Activities completed; Training; Health is intended as a causal factor about the extent of completion of activities, in particular training activities, and in particular health training activities.

Formal notes (optional)#

One clean way to define “zoom level” is by counting components:

See also#